Projecting POS Tags And Syntactic Dependencies From English And French To Polish In Aligned Corpora

نویسنده

  • Sylwia Ozdowska
چکیده

This paper presents the first step to project POS tags and dependencies from English and French to Polish in aligned corpora. Both the English and French parts of the corpus are analysed with a POS tagger and a robust parser. The English/Polish bi-text and the French/Polish bi-text are then aligned at the word level with the GIZA++ package. The intersection of IBM-4 Viterbi alignments for both translation directions is used to project the annotations from English and French to Polish. The results show that the precision of direct projection vary according to the type of induced annotations as well as the source language. Moreover, the performances are likely to be improved by defining regular conversion rules among POS tags and dependencies.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Projecting POS tags and syntactic dependencies from English and French to Polish aligned corpora

This paper presents the first step to project POS tags and dependencies from English and French to Polish in aligned corpora. Both the English and French parts of the corpus are analysed with a POS tagger and a robust parser. The English/Polish bi-text and the French/Polish bi-text are then aligned at the word level with the GIZA++ package. The intersection of IBM-4 Viterbi alignments for both ...

متن کامل

Transferring Syntactic Relations from English to Hindi Using Alignments on Local Word Groups

Various works have used word alignments in parallel corpora to transfer information like POS tags, syntactic trees and word senses from source to target sentences. In this paper, we work on the problem of projecting syntactic relations from English to morphologically rich Hindi parallel text. We show the effectiveness of Local Word Groups (LWGs) in simplifying alignments as well as in transferr...

متن کامل

بررسی مقایسه‌ای تأثیر برچسب‌زنی مقولات دستوری بر تجزیه در پردازش خودکار زبان فارسی

In this paper, the role of Part-of-Speech (POS) tagging for parsing in automatic processing of the Persian language is studied. To this end, the impact of the quality of POS tagging as well as the impact of the quantity of information available in the POS tags on parsing are studied. To reach the goals, three parsing scenarios are proposed and compared. In the first scenario, the parser assigns...

متن کامل

مدل ترجمه عبارت-مرزی با استفاده از برچسب‌های کم‌عمق نحوی

Phrase-boundary model for statistical machine translation labels the rules with classes of boundary words on the target side phrases of training corpus. In this paper, we extend the phrase-boundary model using shallow syntactic labels including POS tags and chunk labels. With the priority of chunk labels, the proposed model names non-terminals with shallow syntactic labels on the boundaries of ...

متن کامل

Joint part-of-speech and dependency projection from multiple sources

Most previous work on annotation projection has been limited to a subset of IndoEuropean languages, using only a single source language, and projecting annotation for one task at a time. In contrast, we present an Integer Linear Programming (ILP) algorithm that simultaneously projects annotation for multiple tasks from multiple source languages, relying on parallel corpora available for hundred...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006